Analysis of GLDS-218 from NASA GeneLab

This R markdown file was auto-generated by the iDEP website Using iDEP 0.91, originally by Steven

Ge SX, Son EW, Yao R: iDEP: an integrated web application for differential expression and pathway analysis of RNA-Seq data. BMC Bioinformatics 2018, 19(1):534. PMID:30567491

1. Read data

First we set up the working directory to where the files are saved.

 setwd('~/Documents/HTML_R/GLDS218')

R packages and iDEP core Functions. Users can also download the iDEP_core_functions.R file. Many R packages needs to be installed first. This may take hours. Each of these packages took years to develop.So be a patient thief. Sometimes dependencies needs to be installed manually. If you are using an older version of R, and having trouble with package installation, try un-install the current version of R, delete all folders and files (C:/Program Files/R/R-3.4.3), and reinstall from scratch.

 if(file.exists('iDEP_core_functions.R'))
    source('iDEP_core_functions.R') else 
    source('https://raw.githubusercontent.com/iDEP-SDSU/idep/master/shinyapps/idep/iDEP_core_functions.R') 

We are using the downloaded gene expression file where gene IDs has been converted to Ensembl gene IDs. This is because the ID conversion database is too large to download. You can use your original file if your file uses Ensembl ID, or you do not want to use the pathway files available in iDEP (or it is not available).

 inputFile <- 'GLDS218_Expression.csv'
 sampleInfoFile <- 'GLDS218_Sampleinfo.csv'
 gldsMetadataFile <- 'GLDS218_Metadata.csv'
 geneInfoFile <- 'Arabidopsis_thaliana__athaliana_eg_gene_GeneInfo.csv' #Gene symbols, location etc. 
 geneSetFile <- 'Arabidopsis_thaliana__athaliana_eg_gene.db'  # pathway database in SQL; can be GMT format 
 STRING10_speciesFile <- 'https://raw.githubusercontent.com/iDEP-SDSU/idep/master/shinyapps/idep/STRING10_species.csv' 

Parameters for reading data

 input_missingValue <- 'geneMedian' #Missing values imputation method
 input_dataFileFormat <- 1  #1- read counts, 2 FKPM/RPKM or DNA microarray
 input_minCounts <- 0.5 #Min counts
 input_NminSamples <- 1 #Minimum number of samples 
 input_countsLogStart <- 4  #Pseudo count for log CPM
 input_CountsTransform <- 1 #Methods for data transformation of counts. 1-EdgeR's logCPM 2-VST, 3-rlog 
readMetadata.out <- readMetadata(gldsMetadataFile)
library(knitr)   #  install if needed. for showing tables with kable
library(kableExtra)
kable( readMetadata.out ) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%")
AthaWsleafFLTRep1RNAseqRNAseq AthaWsleafFLTRep2RNAseqRNAseq AthaWsleafFLTRep3RNAseqRNAseq AthaWsleafGCRep1RNAseqRNAseq AthaWsleafGCRep2RNAseqRNAseq AthaWsleafGCRep3RNAseqRNAseq AthaWsrootFLTRep1RNAseqRNAseq AthaWsrootFLTRep2RNAseqRNAseq AthaWsrootFLTRep3RNAseqRNAseq AthaWsrootGCRep1RNAseqRNAseq AthaWsrootGCRep2RNAseqRNAseq AthaWsrootGCRep3RNAseqRNAseq
X Atha.Ws.leaf.FLT.Rep1 Atha.Ws.leaf.FLT.Rep2 Atha.Ws.leaf.FLT.Rep3 Atha.Ws.leaf.GC.Rep1 Atha.Ws.leaf.GC.Rep2 Atha.Ws.leaf.GC.Rep3 Atha.Ws.root.FLT.Rep1 Atha.Ws.root.FLT.Rep2 Atha.Ws.root.FLT.Rep3 Atha.Ws.root.GC.Rep1 Atha.Ws.root.GC.Rep2 Atha.Ws.root.GC.Rep3
Sample.Name Atha_Ws_leaf_FLT_Rep1 Atha_Ws_leaf_FLT_Rep2 Atha_Ws_leaf_FLT_Rep3 Atha_Ws_leaf_GC_Rep1 Atha_Ws_leaf_GC_Rep2 Atha_Ws_leaf_GC_Rep3 Atha_Ws_root_FLT_Rep1 Atha_Ws_root_FLT_Rep2 Atha_Ws_root_FLT_Rep3 Atha_Ws_root_GC_Rep1 Atha_Ws_root_GC_Rep2 Atha_Ws_root_GC_Rep3
GLDS 218 218 218 218 218 218 218 218 218 218 218 218
Accession GLDS-218 GLDS-218 GLDS-218 GLDS-218 GLDS-218 GLDS-218 GLDS-218 GLDS-218 GLDS-218 GLDS-218 GLDS-218 GLDS-218
Hardware VEGGIE: Petri dish VEGGIE: Petri dish VEGGIE: Petri dish VEGGIE: Petri dish VEGGIE: Petri dish VEGGIE: Petri dish VEGGIE: Petri dish VEGGIE: Petri dish VEGGIE: Petri dish VEGGIE: Petri dish VEGGIE: Petri dish VEGGIE: Petri dish
Tissue Shoots Shoots Shoots Shoots Shoots Shoots Roots Roots Roots Roots Roots Roots
Age 11 days 11 days 11 days 11 days 11 days 11 days 11 days 11 days 11 days 11 days 11 days 11 days
Organism Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana Arabidopsis thaliana
Ecotype WS-0 WS-0 WS-0 WS-0 WS-0 WS-0 WS-0 WS-0 WS-0 WS-0 WS-0 WS-0
Genotype WT WT WT WT WT WT WT WT WT WT WT WT
Variety WS-0 WT WS-0 WT WS-0 WT WS-0 WT WS-0 WT WS-0 WT WS-0 WT WS-0 WT WS-0 WT WS-0 WT WS-0 WT WS-0 WT
Radiation Cosmic radiation Cosmic radiation Cosmic radiation Background Earth Background Earth Background Earth Cosmic radiation Cosmic radiation Cosmic radiation Background Earth Background Earth Background Earth
Gravity Microgravity Microgravity Microgravity Terrestrial Terrestrial Terrestrial Microgravity Microgravity Microgravity Terrestrial Terrestrial Terrestrial
Developmental 11 day old seedling roots 11 day old seedling roots 11 day old seedling roots 11 day old seedling roots 11 day old seedling roots 11 day old seedling roots 11 day old seedling roots 11 day old seedling roots 11 day old seedling roots 11 day old seedling roots 11 day old seedling roots 11 day old seedling roots
Time.series.or.Concentration.gradient Single time point Single time point Single time point Single time point Single time point Single time point Single time point Single time point Single time point Single time point Single time point Single time point
Light Light Light Light Light Light Light Light Light Light Light Light Light
Assay..RNAseq. RNAseq Transcription Profiling & methylation profiling RNAseq Transcription Profiling & methylation profiling RNAseq Transcription Profiling & methylation profiling RNAseq Transcription Profiling & methylation profiling RNAseq Transcription Profiling & methylation profiling RNAseq Transcription Profiling & methylation profiling RNAseq Transcription Profiling & methylation profiling RNAseq Transcription Profiling & methylation profiling RNAseq Transcription Profiling & methylation profiling RNAseq Transcription Profiling & methylation profiling RNAseq Transcription Profiling & methylation profiling RNAseq Transcription Profiling & methylation profiling
Temperature 22-24 22-24 22-24 22-24 22-24 22-24 22-24 22-24 22-24 22-24 22-24 22-24
Treatment.type Epigenomics in an extraterrestrial environment: organ-specific alteration of DNA methylation and gene expression elicited by spaceflight in Arabidopsis thaliana Epigenomics in an extraterrestrial environment: organ-specific alteration of DNA methylation and gene expression elicited by spaceflight in Arabidopsis thaliana Epigenomics in an extraterrestrial environment: organ-specific alteration of DNA methylation and gene expression elicited by spaceflight in Arabidopsis thaliana Epigenomics in an extraterrestrial environment: organ-specific alteration of DNA methylation and gene expression elicited by spaceflight in Arabidopsis thaliana Epigenomics in an extraterrestrial environment: organ-specific alteration of DNA methylation and gene expression elicited by spaceflight in Arabidopsis thaliana Epigenomics in an extraterrestrial environment: organ-specific alteration of DNA methylation and gene expression elicited by spaceflight in Arabidopsis thaliana Epigenomics in an extraterrestrial environment: organ-specific alteration of DNA methylation and gene expression elicited by spaceflight in Arabidopsis thaliana Epigenomics in an extraterrestrial environment: organ-specific alteration of DNA methylation and gene expression elicited by spaceflight in Arabidopsis thaliana Epigenomics in an extraterrestrial environment: organ-specific alteration of DNA methylation and gene expression elicited by spaceflight in Arabidopsis thaliana Epigenomics in an extraterrestrial environment: organ-specific alteration of DNA methylation and gene expression elicited by spaceflight in Arabidopsis thaliana Epigenomics in an extraterrestrial environment: organ-specific alteration of DNA methylation and gene expression elicited by spaceflight in Arabidopsis thaliana Epigenomics in an extraterrestrial environment: organ-specific alteration of DNA methylation and gene expression elicited by spaceflight in Arabidopsis thaliana
Treatment.intensity x x x x x x x x x x x x
Treament.timing x x x x x x x x x x x x
Preservation.Method. RNAlater RNAlater RNAlater RNAlater RNAlater RNAlater RNAlater RNAlater RNAlater RNAlater RNAlater RNAlater
 readData.out <- readData(inputFile) 
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
   kable( head(readData.out$data) ) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
leaf_FLT_Rep1 leaf_FLT_Rep2 leaf_FLT_Rep3 leaf_GC_Rep1 leaf_GC_Rep2 leaf_GC_Rep3 root_FLT_Rep1 root_FLT_Rep2 root_FLT_Rep3 root_GC_Rep1 root_GC_Rep2 root_GC_Rep3
AT2G39730 17.78293 18.17313 18.16606 10.717916 11.207033 10.074624 18.40095 17.71456 17.77562 10.018001 10.011747 10.495878
AT5G38410 17.37619 17.61313 17.61001 8.494508 9.108077 7.906913 17.83506 17.43857 17.46351 7.668614 8.160298 8.434496
AT2G34420 15.79206 17.26853 17.10717 11.446016 11.911281 10.679680 17.10007 17.46595 17.29179 10.615707 10.424001 10.360353
AT1G29910 15.55316 16.85377 16.79474 10.222624 10.860985 9.602112 16.87975 17.25823 17.06765 9.330598 9.289041 9.214425
AT3G09260 11.07122 11.10264 10.55219 16.954220 17.024522 16.568068 10.03219 10.36804 10.20239 16.758730 16.893935 16.743435
AT3G47470 15.62282 16.76636 16.77697 10.653710 11.075213 10.108569 16.94608 17.08041 16.93297 9.961969 10.041860 10.037081
 readSampleInfo.out <- readSampleInfo(sampleInfoFile) 
 kable( readSampleInfo.out ) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
Gravity Tissue
leaf_FLT_Rep1 Microgravity Shoots
leaf_FLT_Rep2 Microgravity Shoots
leaf_FLT_Rep3 Microgravity Shoots
leaf_GC_Rep1 Terrestrial Shoots
leaf_GC_Rep2 Terrestrial Shoots
leaf_GC_Rep3 Terrestrial Shoots
root_FLT_Rep1 Microgravity Roots
root_FLT_Rep2 Microgravity Roots
root_FLT_Rep3 Microgravity Roots
root_GC_Rep1 Terrestrial Roots
root_GC_Rep2 Terrestrial Roots
root_GC_Rep3 Terrestrial Roots
 input_selectOrg ="NEW" 
 input_selectGO <- 'GOBP'   #Gene set category 
 input_noIDConversion = TRUE  
 allGeneInfo.out <- geneInfo(geneInfoFile) 
 converted.out = NULL 
 convertedData.out <- convertedData()    
 nGenesFilter()  
## [1] "16156 genes in 12 samples. 16074  genes passed filter.\n Original gene IDs used."
 convertedCounts.out <- convertedCounts()  # converted counts, just for compatibility 

2. Pre-process

# Read counts per library 
 parDefault = par() 
 par(mar=c(12,4,2,2)) 
 # barplot of total read counts
 x <- readData.out$rawCounts
 groups = as.factor( detectGroups(colnames(x ) ) )
 if(nlevels(groups)<=1 | nlevels(groups) >20 )  
  col1 = 'green'  else
  col1 = rainbow(nlevels(groups))[ groups ]             
         
 barplot( colSums(x)/1e6, 
        col=col1,las=3, main="Total read counts (millions)")  

 readCountsBias()  # detecting bias in sequencing depth 
## [1] 4.020064e-06
## [1] 4.727448e-08
## [1] 0.8010792
## [1] "Warning! Sequencing depth bias detected. Total read counts are significantly different among sample groups (p= 4.02e-06 ) based on ANOVA.  Total read counts seem to be correlated with factor Gravity (p= 4.73e-08 ).  "
 # Box plot 
 x = readData.out$data 
 boxplot(x, las = 2, col=col1,
    ylab='Transformed expression levels',
    main='Distribution of transformed data') 

 #Density plot 
 par(parDefault) 
## Warning in par(parDefault): graphical parameter "cin" cannot be set
## Warning in par(parDefault): graphical parameter "cra" cannot be set
## Warning in par(parDefault): graphical parameter "csi" cannot be set
## Warning in par(parDefault): graphical parameter "cxy" cannot be set
## Warning in par(parDefault): graphical parameter "din" cannot be set
## Warning in par(parDefault): graphical parameter "page" cannot be set
 densityPlot()       

 # Scatter plot of the first two samples 
 plot(x[,1:2],xlab=colnames(x)[1],ylab=colnames(x)[2], 
    main='Scatter plot of first two samples') 

 ####plot gene or gene family
 input_selectOrg ="BestMatch" 
 input_geneSearch <- 'HOXA' #Gene ID for searching 
 genePlot()  
## NULL
 input_useSD <- 'FALSE' #Use standard deviation instead of standard error in error bar? 
 geneBarPlotError()       
## NULL

3. Heatmap

 # hierarchical clustering tree
 x <- readData.out$data
 maxGene <- apply(x,1,max)
 # remove bottom 25% lowly expressed genes, which inflate the PPC
 x <- x[which(maxGene > quantile(maxGene)[1] ) ,] 
 plot(as.dendrogram(hclust2( dist2(t(x)))), ylab="1 - Pearson C.C.", type = "rectangle") 

 #Correlation matrix
 input_labelPCC <- TRUE #Show correlation coefficient? 
 correlationMatrix() 

 # Parameters for heatmap
 input_nGenes <- 1000   #Top genes for heatmap
 input_geneCentering <- TRUE    #centering genes ?
 input_sampleCentering <- FALSE #Center by sample?
 input_geneNormalize <- FALSE   #Normalize by gene?
 input_sampleNormalize <- FALSE #Normalize by sample?
 input_noSampleClustering <- FALSE  #Use original sample order
 input_heatmapCutoff <- 4   #Remove outliers beyond number of SDs 
 input_distFunctions <- 1   #which distant funciton to use
 input_hclustFunctions <- 1 #Linkage type
 input_heatColors1 <- 1 #Colors
 input_selectFactorsHeatmap <- 'Gravity'    #Sample coloring factors 
 png('heatmap.png', width = 10, height = 15, units = 'in', res = 300) 
 staticHeatmap() 
 dev.off()  
## png 
##   2

[heatmap] (heatmap.png)

 heatmapPlotly() # interactive heatmap using Plotly 

4. K-means clustering

 input_nGenesKNN <- 2000    #Number of genes fro k-Means
 input_nClusters <- 4   #Number of clusters 
 maxGeneClustering = 12000
 input_kmeansNormalization <- 'geneMean'    #Normalization
 input_KmeansReRun <- 0 #Random seed 

 distributionSD()  #Distribution of standard deviations 

 KmeansNclusters()  #Number of clusters 

 Kmeans.out = Kmeans()   #Running K-means 
 KmeansHeatmap()   #Heatmap for k-Means 

 #Read gene sets for enrichment analysis 
 sqlite  <- dbDriver('SQLite')
 input_selectGO3 <- 'GOBP'  #Gene set category
 input_minSetSize <- 15 #Min gene set size
 input_maxSetSize <- 2000   #Max gene set size 
 GeneSets.out <-readGeneSets( geneSetFile,
    convertedData.out, input_selectGO3,input_selectOrg,
    c(input_minSetSize, input_maxSetSize)  )  
 # Alternatively, users can use their own GMT files by
 #GeneSets.out <- readGMTRobust('somefile.GMT')  
 results <- KmeansGO()  #Enrichment analysis for k-Means clusters   
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
Cluster adj.Pval Genes Pathways
A 3.40e-120 96 Photosynthesis
2.11e-65 53 Photosynthesis, light reaction
4.80e-47 63 Generation of precursor metabolites and energy
3.88e-35 92 Organonitrogen compound biosynthetic process
1.81e-32 82 Oxidation-reduction process
3.80e-29 23 Photosynthetic electron transport chain
2.12e-28 20 Photosynthesis, light harvesting
1.08e-27 91 Response to abiotic stimulus
1.58e-26 53 Response to light stimulus
5.30e-26 35 Response to cytokinin
B 1.01e-52 75 Plastid organization
4.02e-51 178 Response to abiotic stimulus
2.64e-36 141 Small molecule metabolic process
3.79e-36 54 Chloroplast organization
1.19e-31 119 Oxidation-reduction process
4.84e-29 120 Response to oxygen-containing compound
1.55e-27 100 Organic acid metabolic process
2.94e-27 99 Oxoacid metabolic process
5.86e-27 132 Response to organic substance
2.37e-26 94 Transmembrane transport
C 8.43e-19 41 Oxidation-reduction process
1.06e-18 16 Antibiotic catabolic process
1.06e-18 19 Cellular response to toxic substance
1.06e-18 20 Detoxification
1.16e-18 15 Hydrogen peroxide catabolic process
5.36e-18 18 Cellular detoxification
4.54e-17 17 Cellular oxidant detoxification
5.09e-17 15 Hydrogen peroxide metabolic process
2.11e-16 21 Response to toxic substance
2.96e-16 15 Cofactor catabolic process
D 7.69e-77 134 Response to inorganic substance
1.16e-68 86 Response to cadmium ion
1.87e-68 181 Response to abiotic stimulus
1.27e-67 95 Response to metal ion
7.98e-57 153 Small molecule metabolic process
3.75e-55 148 Organonitrogen compound biosynthetic process
1.15e-51 92 Response to osmotic stress
5.54e-50 86 Response to salt stress
5.21e-46 113 Oxoacid metabolic process
1.17e-45 113 Organic acid metabolic process
 input_seedTSNE <- 0    #Random seed for t-SNE
 input_colorGenes <- TRUE   #Color genes in t-SNE plot? 
 tSNEgenePlot()  #Plot genes using t-SNE 

5. PCA and beyond

 input_selectFactors <- 'Gravity'   #Factor coded by color
 input_selectFactors2 <- 'Tissue'   #Factor coded by shape
 input_tsneSeed2 <- 0   #Random seed for t-SNE 
 #PCA, MDS and t-SNE plots
 PCAplot()  

 MDSplot() 

 tSNEplot()  

 #Read gene sets for pathway analysis using PGSEA on principal components 
 input_selectGO6 <- 'GOBP' 
 GeneSets.out <-readGeneSets( geneSetFile,
    convertedData.out, input_selectGO6,input_selectOrg,
    c(input_minSetSize, input_maxSetSize)  )  
 PCApathway() # Run PGSEA analysis 
## Warning: Package 'KEGG.db' is deprecated and will be removed from Bioconductor
##   version 3.12

 cat( PCA2factor() )   #The correlation between PCs with factors 
## 
##  Correlation between Principal Components (PCs) with factors
## PC1 is correlated with Gravity (p=2.44e-14).

6. DEG1

 input_CountsDEGMethod <- 3 #DESeq2= 3,limma-voom=2,limma-trend=1 
 input_limmaPval <- 0.1 #FDR cutoff
 input_limmaFC <- 2 #Fold-change cutoff
 input_selectModelComprions <- 'Gravity: Microgravity vs. Terrestrial'  #Selected comparisons
 input_selectFactorsModel <- 'Gravity'  #Selected comparisons
 input_selectInteractions <- NULL   #Selected comparisons
 input_selectBlockFactorsModel <- NULL  #Selected comparisons
 factorReferenceLevels.out <- c('Gravity:Microgravity') 

 limma.out <- limma()
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
 DEG.data.out <- DEG.data()
 limma.out$comparisons 
## [1] "Microgravity-Terrestrial"
 input_selectComparisonsVenn = limma.out$comparisons[1:3] # use first three comparisons
 input_UpDownRegulated <- FALSE #Split up and down regulated genes 
 vennPlot() # Venn diagram 

  sigGeneStats() # number of DEGs as figure 

  sigGeneStatsTable() # number of DEGs as table 
##                                       Comparisons   Up Down
## Microgravity-Terrestrial Microgravity-Terrestrial 2820 2816

7. DEG2

 input_selectContrast <- 'Microgravity-Terrestrial' #Selected comparisons 
 selectedHeatmap.data.out <- selectedHeatmap.data()
 selectedHeatmap()   # heatmap for DEGs in selected comparison

 # Save gene lists and data into files
 write.csv( selectedHeatmap.data()$genes, 'heatmap.data.csv') 
 write.csv(DEG.data(),'DEG.data.csv' )
 write(AllGeneListsGMT() ,'AllGeneListsGMT.gmt')
 input_selectGO2 <- 'GOBP'  #Gene set category 
 geneListData.out <- geneListData()  
 volcanoPlot()  

  scatterPlot()  

  MAplot()  

  geneListGOTable.out <- geneListGOTable()  
 # Read pathway data again 
 GeneSets.out <-readGeneSets( geneSetFile,
    convertedData.out, input_selectGO2,input_selectOrg,
    c(input_minSetSize, input_maxSetSize)  ) 
 input_removeRedudantSets <- TRUE   #Remove highly redundant gene sets? 
 results <- geneListGO()  #Enrichment analysis
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
Direction adj.Pval nGenes Pathways
Down regulated 6.7e-49 176 Cell wall organization or biogenesis
1.6e-48 351 Cell communication
2.1e-48 291 Oxidation-reduction process
6.7e-47 322 Signaling
8.2e-46 151 External encapsulating structure organization
9.4e-46 317 Signal transduction
2.0e-43 143 Cell wall organization
1.4e-40 214 Carbohydrate metabolic process
3.2e-39 225 Transmembrane transport
2.9e-36 286 Response to hormone
Up regulated 5.9e-156 189 Photosynthesis
4.6e-112 173 Plastid organization
4.3e-99 110 Photosynthesis, light reaction
4.2e-83 131 Chloroplast organization
4.6e-81 438 Response to abiotic stimulus
7.8e-79 345 Oxidation-reduction process
3.4e-69 373 Small molecule metabolic process
1.0e-63 164 Generation of precursor metabolites and energy
3.1e-53 191 Response to light stimulus
6.0e-51 192 Response to radiation

STRING-db API access. We need to find the taxonomy id of your species, this used by STRING. First we try to guess the ID based on iDEP’s database. Users can also skip this step and assign NCBI taxonomy id directly by findTaxonomyID.out = 10090 # mouse 10090, human 9606 etc.

 STRING10_species = read.csv(STRING10_speciesFile)  
 ix = grep('Arabidopsis thaliana', STRING10_species$official_name ) 
 findTaxonomyID.out <- STRING10_species[ix,1] # find taxonomyID
 findTaxonomyID.out  
## [1] 3702

Enrichment analysis using STRING

  STRINGdb_geneList.out <- STRINGdb_geneList() #convert gene lists
## Warning:  we couldn't map to STRING 0% of your identifiers
 input_STRINGdbGO <- 'Process'  #'Process', 'Component', 'Function', 'KEGG', 'Pfam', 'InterPro' 
 results <- stringDB_GO_enrichmentData()  # enrichment using STRING 
## Warning in string_db$get_enrichment(ids, category = input_STRINGdbGO, methodMT =
## "fdr", : methodMT parameter is depecated. Only FDR correction is available.
## Warning in string_db$get_enrichment(ids, category = input_STRINGdbGO, methodMT =
## "fdr", : iea parameter is deprecated.
## [1] "Process"
## Warning in string_db$get_enrichment(ids, category = input_STRINGdbGO, methodMT =
## "fdr", : methodMT parameter is depecated. Only FDR correction is available.

## Warning in string_db$get_enrichment(ids, category = input_STRINGdbGO, methodMT =
## "fdr", : iea parameter is deprecated.
## [1] "Process"
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
“No significant enrichment found.” adj.Pval
No significant enrichment found. NULL

PPI network retrieval and analysis

 input_nGenesPPI <- 100 #Number of top genes for PPI retrieval and analysis 
 stringDB_network1(1) #Show PPI network 

Generating interactive PPI

 write(stringDB_network_link(), 'PPI_results.html') # write results to html file 
## Warning: 'string_db$get_link' is deprecated.
## Use 'Contact developers to request functionality' instead.
## See help("Deprecated")
## Warning:  we couldn't map to STRING 0% of your identifiers
## Warning: 'string_db$get_link' is deprecated.
## Use 'Contact developers to request functionality' instead.
## See help("Deprecated")

## Warning: 'string_db$get_link' is deprecated.
## Use 'Contact developers to request functionality' instead.
## See help("Deprecated")
 browseURL('PPI_results.html') # open in browser 

8. Pathway analysis

 input_selectContrast1 <- 'Microgravity-Terrestrial'    #select Comparison 
 #input_selectContrast1 = limma.out$comparisons[3] # manually set
 input_selectGO <- 'GOBP'   #Gene set category 
 #input_selectGO='custom' # if custom gmt file
 input_minSetSize <- 15 #Min size for gene set
 input_maxSetSize <- 2000   #Max size for gene set 
 # Read pathway data again 
 GeneSets.out <-readGeneSets( geneSetFile,
    convertedData.out, input_selectGO,input_selectOrg,
    c(input_minSetSize, input_maxSetSize)  ) 
 input_pathwayPvalCutoff <- 0.2 #FDR cutoff
 input_nPathwayShow <- 30   #Top pathways to show
 input_absoluteFold <- FALSE    #Use absolute values of fold-change?
 input_GenePvalCutoff <- 1  #FDR to remove genes 

 input_pathwayMethod = 1  # 1  GAGE
 gagePathwayData.out <- gagePathwayData()  # pathway analysis using GAGE  
   
 results <- gagePathwayData.out  #Enrichment analysis for k-Means clusters  
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
Direction GAGE analysis: Microgravity vs Terrestrial statistic Genes adj.Pval
Down Root development -6.3445 438 1.8e-07
Root system development -6.3339 439 1.8e-07
Root morphogenesis -5.8486 229 3.1e-06
External encapsulating structure organization -5.1164 435 9.1e-05
Cell wall organization -5.0641 412 9.6e-05
Up Photosynthesis 15.5205 222 2.8e-40
Photosynthesis, light reaction 13.3132 119 1.1e-27
Plastid organization 10.2187 256 6.4e-20
Photosynthetic electron transport chain 10.0483 46 4.2e-14
Photosynthesis, light harvesting in photosystem I 10.0345 16 3.8e-08
Photosynthesis, light harvesting 9.6901 31 1.1e-11
Chloroplast organization 8.5912 197 4.2e-14
Generation of precursor metabolites and energy 8.1922 391 1.8e-13
Photosystem II assembly 7.4063 25 2.0e-07
Protein-chromophore linkage 7.2691 39 2.5e-08
Tetrapyrrole metabolic process 6.9298 93 8.0e-09
Chlorophyll metabolic process 6.9246 81 9.4e-09
Porphyrin-containing compound metabolic process 6.8832 92 9.4e-09
Thylakoid membrane organization 6.8321 46 1.2e-07
Chlorophyll biosynthetic process 6.0264 58 1.2e-06
Tetrapyrrole biosynthetic process 6.0236 70 9.1e-07
Response to high light intensity 5.9608 67 1.2e-06
Porphyrin-containing compound biosynthetic process 5.849 67 1.9e-06
Response to light intensity 5.6113 132 2.6e-06
Electron transport chain 5.5453 177 2.8e-06
Cofactor biosynthetic process 5.384 283 4.8e-06
Plastid membrane organization 5.3788 49 2.2e-05
Regulation of photosynthesis 5.2529 37 5.7e-05
NcRNA metabolic process 4.9137 425 4.6e-05
Response to temperature stimulus 4.7745 490 7.8e-05
 pathwayListData.out = pathwayListData() 
 enrichmentPlot(pathwayListData.out, 25  ) 

  enrichmentNetwork(pathwayListData.out )  

  enrichmentNetworkPlotly(pathwayListData.out) 

## Warning: `arrange_()` is deprecated as of dplyr 0.7.0.
## Please use `arrange()` instead.
## See vignette('programming') for more help
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
 input_pathwayMethod = 3  # 1  fgsea 
 fgseaPathwayData.out <- fgseaPathwayData() #Pathway analysis using fgsea 
## Warning in fgsea(pathways = gmt, stats = fold, minSize = input_minSetSize, :
## You are trying to run fgseaSimple. It is recommended to use fgseaMultilevel. To
## run fgseaMultilevel, you need to remove the nperm argument in the fgsea function
## call.
 results <- fgseaPathwayData.out  #Enrichment analysis for k-Means clusters 
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
Direction GSEA analysis: Microgravity vs Terrestrial NES Genes adj.Pval
Up Photosynthesis 3.0162 222 9.2e-03
Photosynthesis, light reaction 2.9021 119 7.8e-03
Photosynthetic electron transport chain 2.6234 46 7.2e-03
Plastid organization 2.5596 256 9.9e-03
Porphyrin-containing compound metabolic process 2.5016 92 7.4e-03
Chlorophyll metabolic process 2.4886 81 7.4e-03
Tetrapyrrole metabolic process 2.4822 93 7.4e-03
Photosynthesis, light harvesting 2.4743 31 7.2e-03
Chlorophyll biosynthetic process 2.4616 58 7.3e-03
Porphyrin-containing compound biosynthetic process 2.4614 67 7.4e-03
Generation of precursor metabolites and energy 2.4504 391 1.3e-02
Chloroplast organization 2.4447 197 8.9e-03
Protein-chromophore linkage 2.4431 39 7.2e-03
Response to high light intensity 2.4328 67 7.4e-03
Tetrapyrrole biosynthetic process 2.4112 70 7.4e-03
Thylakoid membrane organization 2.3774 46 7.2e-03
Electron transport chain 2.3751 177 8.5e-03
Photosystem II assembly 2.2991 25 7.2e-03
Regulation of photosynthesis 2.2822 37 7.2e-03
Response to light intensity 2.2579 132 7.9e-03
Translation 2.2516 617 2.4e-02
Protein peptidyl-prolyl isomerization 2.2495 55 7.3e-03
Peptide biosynthetic process 2.2467 620 2.4e-02
Regulation of generation of precursor metabolites and energy 2.1869 27 7.2e-03
Plastid membrane organization 2.1865 49 7.2e-03
Photosynthesis, light harvesting in photosystem I 2.1745 16 7.2e-03
Protein targeting to chloroplast 2.1741 43 7.2e-03
Establishment of protein localization to chloroplast 2.1741 43 7.2e-03
Protein localization to chloroplast 2.173 45 7.2e-03
Amide biosynthetic process 2.1635 681 2.6e-02
  pathwayListData.out = pathwayListData() 
 enrichmentPlot(pathwayListData.out, 25  ) 

  enrichmentNetwork(pathwayListData.out )  

  enrichmentNetworkPlotly(pathwayListData.out) 

   PGSEAplot() # pathway analysis using PGSEA 
## 
## Computing P values using ANOVA

9. Chromosome

 input_selectContrast2 <- 'Microgravity-Terrestrial'    #select Comparison 
 #input_selectContrast2 = limma.out$comparisons[3] # manually set
 input_limmaPvalViz <- 0.1  #FDR to filter genes
 input_limmaFCViz <- 2  #FDR to filter genes 
 genomePlotly() # shows fold-changes on the genome 
## Warning in eval(quote(list(...)), env): NAs introduced by coercion
## Warning in genomePlotly(): NAs introduced by coercion

10. Biclustering

 input_nGenesBiclust <- 1000    #Top genes for biclustering
 input_biclustMethod <- 'BCCC()'    #Method: 'BCCC', 'QUBIC', 'runibic' ... 
 biclustering.out = biclustering()  # run analysis

 input_selectBicluster <- 1 #select a cluster 
 biclustHeatmap()   # heatmap for selected cluster 

 input_selectGO4 <- 'GOBP'  #Gene set category 
 # Read pathway data again 
 GeneSets.out <-readGeneSets( geneSetFile,
    convertedData.out, input_selectGO4,input_selectOrg,
    c(input_minSetSize, input_maxSetSize)  )  
 results <- geneListBclustGO()  #Enrichment analysis for k-Means clusters   
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
adj.Pval Genes Pathways
3.7e-74 199 Response to abiotic stimulus
4.3e-59 162 Organonitrogen compound biosynthetic process
1.1e-56 119 Response to inorganic substance
2.8e-47 149 Small molecule metabolic process
4.7e-41 84 Cofactor metabolic process
2.5e-40 125 Oxidation-reduction process
6.0e-40 61 Plastid organization
9.6e-40 73 Response to metal ion
2.6e-39 64 Response to cadmium ion
4.0e-37 141 Response to organic substance

11. Co-expression network

 input_mySoftPower <- 5 #SoftPower to cutoff
 input_nGenesNetwork <- 1000    #Number of top genes
 input_minModuleSize <- 20  #Module size minimum 
 wgcna.out = wgcna()   # run WGCNA  
## Warning: executing %dopar% sequentially: no parallel backend registered
##    Power SFT.R.sq  slope truncated.R.sq mean.k. median.k. max.k.
## 1      1   0.5090 1.8500          0.770     880       930    937
## 2      2   0.3240 0.9140          0.742     814       887    900
## 3      3   0.2380 0.6800          0.768     766       854    872
## 4      4   0.1630 0.4900          0.725     728       826    850
## 5      5   0.1270 0.4040          0.723     697       802    831
## 6      6   0.1050 0.3650          0.784     670       781    814
## 7      7   0.0852 0.2990          0.766     647       761    800
## 8      8   0.0996 0.3450          0.862     626       743    786
## 9      9   0.0683 0.2750          0.851     607       727    774
## 10    10   0.0590 0.2380          0.831     590       711    763
## 11    12   0.0498 0.2110          0.858     560       683    743
## 12    14   0.0360 0.1710          0.837     534       657    725
## 13    16   0.0213 0.1160          0.814     511       633    708
## 14    18   0.0203 0.1130          0.827     491       612    693
## 15    20   0.0127 0.0871          0.808     472       592    680
## TOM calculation: adjacency..
## ..will not use multithreading.
##  Fraction of slow calculations: 0.000000
## ..connectivity..
## ..matrix multiplication (system BLAS)..
## ..normalization..
## ..done.
 softPower()  # soft power curve 

  modulePlot()  # plot modules  

  listWGCNA.Modules.out = listWGCNA.Modules() #modules
 input_selectGO5 <- 'GOBP'  #Gene set category 
 # Read pathway data again 
 GeneSets.out <-readGeneSets( geneSetFile,
    convertedData.out, input_selectGO5,input_selectOrg,
    c(input_minSetSize, input_maxSetSize)  ) 
 input_selectWGCNA.Module <- '1. turquoise (999 genes)' #Select a module
 input_topGenesNetwork <- 10    #SoftPower to cutoff
 input_edgeThreshold <- 0.4 #Number of top genes 
 moduleNetwork()    # show network of top genes in selected module
##  softConnectivity: FYI: connecitivty of genes with less than 4 valid samples will be returned as NA.
##  ..calculating connectivities..

 input_removeRedudantSets <- TRUE   #Remove redundant gene sets 
 results <- networkModuleGO()  #Enrichment analysis of selected module
 results$adj.Pval <- format( results$adj.Pval,digits=3 )
 kable( results, row.names=FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  scroll_box(width = "100%") 
adj.Pval Genes Pathways
1.2e-115 121 Photosynthesis
7.5e-114 284 Response to abiotic stimulus
1.0e-78 200 Oxidation-reduction process
5.5e-77 160 Response to inorganic substance
3.1e-75 216 Small molecule metabolic process
1.1e-64 122 Cofactor metabolic process
2.9e-63 106 Generation of precursor metabolites and energy
5.4e-62 194 Organonitrogen compound biosynthetic process
1.6e-58 63 Photosynthesis, light reaction
2.9e-56 77 Response to cytokinin